Gaussian Mixture Models For Extraction Of Melodic Lines From Audio Recordings

Author

  • Matija Marolt
Abstract

This study addresses the extraction of melodic line(s) from polyphonic audio recordings. Our approach is based on the expectation-maximization (EM) algorithm, applied in a two-step procedure that finds melodic lines in audio signals. In the first step, EM is used to find regions of the signal with strong and stable pitch (melodic fragments). In the second step, these fragments are grouped into clusters according to their properties (pitch, loudness...). The resulting clusters represent distinct melodic lines. Gaussian mixture models (GMMs) trained with EM are used for the clustering. The paper presents the entire process in more detail and gives some initial results.
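The second step described above (grouping fragments with a Gaussian mixture model trained by EM) can be illustrated with a minimal sketch. It is not the author's implementation: it assumes each melodic fragment is summarized by a single feature (its mean pitch, given here as made-up MIDI note numbers), whereas the abstract mentions several properties. The function names fit_gmm_1d and assign, the toy data, and the choice of two clusters are all hypothetical.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_1d(data, k, n_iter=50, min_var=1e-3):
    """Fit a k-component 1-D Gaussian mixture to data with plain EM."""
    n = len(data)
    ordered = sorted(data)
    # Deterministic initialisation (an assumption): spread the means
    # evenly over the sorted data and start from the global variance.
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    centre = sum(data) / n
    global_var = max(sum((x - centre) ** 2 for x in data) / n, min_var)
    variances = [global_var] * k
    weights = [1.0 / k] * k

    for _ in range(n_iter):
        # E-step: responsibility of each component for each fragment.
        resp = []
        for x in data:
            p = [weights[j] * gaussian_pdf(x, means[j], variances[j]) for j in range(k)]
            total = sum(p)
            resp.append([pj / total for pj in p])
        # M-step: re-estimate weight, mean and variance of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # component lost all its mass; keep old parameters
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, min_var)  # floor avoids collapse to a point
    return weights, means, variances


def assign(x, weights, means, variances):
    """Index of the mixture component most likely to have produced x."""
    scores = [w * gaussian_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
    return scores.index(max(scores))


# Hypothetical mean pitches of ten melodic fragments (MIDI note numbers):
# five from a low line and five from a high line.
fragment_pitches = [48.1, 47.8, 50.2, 49.0, 48.5, 72.3, 71.9, 74.0, 73.1, 72.6]

weights, means, variances = fit_gmm_1d(fragment_pitches, k=2)
labels = [assign(x, weights, means, variances) for x in fragment_pitches]
print(labels)
```

Each label is the index of the cluster, and hence of the candidate melodic line, that a fragment is assigned to. With real data the feature vector would be multi-dimensional and the number of components would have to be chosen or estimated; neither detail is specified in the abstract.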

Similar resources

On Finding Melodic Lines in Audio Recordings

The paper presents our approach to the problem of finding melodic line(s) in polyphonic audio recordings. The approach is composed of two different stages, partially rooted in psychoacoustic theories of music perception: the first stage is dedicated to finding regions with strong and stable pitch (melodic fragments), while in the second stage, these fragments are grouped according to their prop...


Optimizing Melodic Extraction Algorithm for Jazz Guitar Recordings Using Genetic Algorithms

Extraction of the main melody of a musical piece is a preliminary step in the process of transcribing the piece. Automatic melodic extraction is the task of computationally extracting what a human listener would perceive as the main melody of a polyphonic recording. Several melodic extraction systems have been proposed. However, such systems normally require a number of parameters to be manuall...


Speaker Clustering With Neural Networks And Audio Processing

Speaker clustering is the task of differentiating speakers in a recording. In a way, the aim is to answer "who spoke when" in audio recordings. A common method used in industry is feature extraction directly from the recording thanks to MFCC features, and by using well-known techniques such as Gaussian Mixture Models (GMM) and Hidden Markov Models (HMM). In this paper, we studied neural network...


Melodic Pattern Extraction in Large Collections of Music Recordings Using Time Series Mining Techniques

We demonstrate a data-driven unsupervised approach for the discovery of melodic patterns in large collections of Indian art music recordings. The approach first works on single recordings and subsequently searches in the entire music collection. Melodic similarity is based on dynamic time warping. The task being computationally intensive, lower bounding and early abandoning techniques are appli...


Multimedia fusion in automatic extraction of studio speech segments for spoken document retrieval

This paper describes our progress in Cantonese spoken document retrieval. Over 60 hours of Cantonese television news broadcasts have been collected as part of AoE-IT Multimedia Repository. We have also developed the Multimedia Markup Language (MmML) for annotating the multimedia content in terms of anchor/field video frames and audio recordings. The audio tracks are indexed by a Cantonese sylla...


Publication date: 2004